Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DENSE_RANK aggregate function #1484

Conversation

claesrosell
Copy link
Contributor

Implementation and unit tests of DENSE_RANK. I could not decide if I should change the file format version or not, I ended up not doing it.

Implementation and unit tests.
Moved back to JavaSE-11
@merks
Copy link
Contributor

merks commented Nov 4, 2023

For future reference, you can modify the Bundle-RequiredExecutionEnvironment in the MANIFEST.MF and save. Then on the context menu of the plug-in project use Plug-in Tools -> Update Classpath... which will update the .classpath and the org.eclipse.jdt.core.prefs to match that value.

@speckyspooky speckyspooky added this to the 4.14 milestone Nov 4, 2023
@speckyspooky speckyspooky added the Enhancement Small change to improve the current supported functionality label Nov 4, 2023
@claesrosell
Copy link
Contributor Author

@hvbtup I suspect that you have a lot on your plate. Still, I would like you to test this out since you suggested this new aggregation function and have a lot more knowledge about report design than me.

Copy link
Contributor

@hvbtup hvbtup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works well if the expression is NOT NULL.
But the report fails with

java.lang.ClassCastException: class org.eclipse.birt.data.aggregation.impl.rank.NullObject cannot be cast to class java.lang.Comparable

if the expression can be null.

For example, the report crashes if I use the following query

select 
   PRODUCTCODE
 , PRODUCTNAME
 , PRODUCTLINE
 , PRODUCTSCALE
 , PRODUCTVENDOR
 , PRODUCTDESCRIPTION
 , QUANTITYINSTOCK
 , BUYPRICE
 , MSRP
 , case when productname like '%f%' then null else buyprice end as x
from products

and then use RANK or DENSERANK based on the column X.

A minor issue (but my German picky eyes stumble over it...) is the non-consistent spelling.
There are

  • DENSERANK (the visible form, which is OK, because it is in line with eg. RUNNINGCOUNT)
  • TOTAL_DENSE__RANK_FUNC (with two underscores before RANK)
  • DENSE_RANK

Fixed NPE and did some cleaning
Fixed name inconsistency of DENSERANK
@claesrosell
Copy link
Contributor Author

With this latest commit everything should be addressed. I did some more clean-up to the RankObjComparator, added generic types to it.

@claesrosell claesrosell requested a review from hvbtup November 6, 2023 21:04
Copy link
Contributor

@hvbtup hvbtup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's looking and working much better now!

I think it's worth noting in the docs (if we only had good docs) how the BIRT aggregate functions RANK and DENSERANK treat NULL values:

They are considered to be less than all NON NULL values.

Thus with ascending order, NULLs come first, and with descending order, NULLs come last.

This behavior is the same as with the RANK aggregate in previous releases, that's good.

People with a strong DB-specific background (like me) may be surprised, though.
In Oracle, ORDER BY expr [ASC|DESC] NULLS LAST will put NULLs at the end independent of ascending or descending ordering. This SQL syntax is standard, but the default is vendor-specific. In Oracle, NULLS LAST is the default; to get the nulls first one has to specify NULLs FIRST explicitly.

However, this is vendor-dependent (see https://modern-sql.com/concept/null).

We might add explicit control over NULL values later.
A workaround (in SQL) is shown on the modern-sql page.
That principle can be used with BIRT as well by adding a boolean expression that tells us if the column is null or not; then we can first sort by this column and then by the column value itself.

@wimjongman wimjongman linked an issue Nov 7, 2023 that may be closed by this pull request
@claesrosell
Copy link
Contributor Author

Today there are no way to pass parameters down to Aggregator functions / Accumulators. If we were to make that possible it would benefit #1089 as well. I might look into that one as my next thing

@claesrosell claesrosell merged commit 3954093 into eclipse-birt:master Nov 7, 2023
1 check passed
@claesrosell claesrosell deleted the 981-add-a-dense_rank-aggregate-function-like-in-sql branch November 7, 2023 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Small change to improve the current supported functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a DENSE_RANK aggregate function (like in SQL)
4 participants